Text Readability Classification of Textbooks of a Low-Resource Language
Authors
Abstract
Many languages are considered low-density languages, either because the population speaking the language is small or because insufficient digitized text is available in it, even though millions of people speak it. Bangla belongs to the latter group. Readability classification is an important Natural Language Processing (NLP) application that can be used to judge the quality of documents and help writers locate possible problems. This paper presents a readability classifier for Bangla textbook documents based on information-theoretic and lexical features. The features proposed in this paper yield an F-score 50% higher than that of traditional readability formulas.
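As a rough illustration of what lexical and information-theoretic features of a document can look like, here is a minimal sketch. The abstract does not list the paper's actual features, so the ones below (type-token ratio, average word length, unigram entropy), the function name `lexical_and_entropy_features`, and the whitespace tokenization are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def lexical_and_entropy_features(text: str) -> dict:
    """Compute a few illustrative document-level features.

    Hypothetical feature set; not the feature set used in the paper.
    Tokenization is plain whitespace splitting, which is an assumption
    and only a crude approximation for Bangla or any other language.
    """
    tokens = text.split()
    if not tokens:
        return {
            "num_tokens": 0,
            "type_token_ratio": 0.0,
            "avg_word_length": 0.0,
            "unigram_entropy": 0.0,
        }

    counts = Counter(tokens)
    n = len(tokens)

    # Shannon entropy (in bits) of the document's unigram distribution.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

    return {
        "num_tokens": n,
        # Lexical diversity: distinct word forms divided by total words.
        "type_token_ratio": len(counts) / n,
        # Mean number of characters per token.
        "avg_word_length": sum(len(t) for t in tokens) / n,
        "unigram_entropy": entropy,
    }
```

Feature dictionaries like this could then be passed to any standard classifier; which classifier the paper uses is not stated in the abstract.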
Related resources
Cohesive Readability of Expository Texts and Reading Comprehension Performance: Iranian EFL students of Different Proficiency Levels in Focus
Abstract: The present study investigates the relationship between the cohesive readability of expository texts and reading comprehension in EFL students with different proficiency levels. One hundred undergraduate students majoring in English at the University of Isfahan participated in the study. To collect the relevant data, participants were divide...
Full text
EFL Textbook Evaluation: An Analysis of Readability and Vocabulary Profiler of Four Corners Book Series
This study investigated whether there is a significant relationship between readability and vocabulary profile, including the most frequent words (K1 words) and the Academic Word List (AWL), in the reading passages of the Four Corners series of EFL textbooks. To determine the readability of the texts, the Flesch–Kincaid (1975) readability test was used, while the texts' academic word li...
Full text
Qualitative and Quantitative Examination of Text Type Readabilities: A Comparative Analysis
This study compared two main approaches to readability assessment. The quantitative approach applied idea density based on part-of-speech tagging and compared three sets of text types (i.e., narrative, expository, and argumentative) with respect to their ease of reading. The qualitative approach was carried out by developing questionnaires measuring intermediate EFL learners’ perceptions on content, motivat...
Full text